MindReader: Querying Databases Through Multiple Examples

Authors

  • Yoshiharu Ishikawa
  • Ravishankar Subramanya
  • Christos Faloutsos
Abstract

Users often cannot easily express their queries. For example, in a multimedia/image-by-content setting, the user might want photographs with sunsets; in current systems, like QBIC, the user has to give a sample query and specify the relative importance of color, shape, and texture. Even worse, the user might want correlations between attributes: for example, in a traditional medical record database, a medical researcher might want to find "mildly overweight patients", where the implied query would be "weight/height ≈ 4 lb/inch". Our goal is to provide a user-friendly, but theoretically solid, method to handle such queries. We allow the user to give several examples and, optionally, their 'goodness' scores, and we propose a novel method to "guess" which attributes are important, which correlations are important, and with what weight. Our contributions are twofold: (a) we formalize the problem as a minimization problem and show how to solve for the optimal solution, completely avoiding the ad-hoc heuristics of the past; (b) moreover, we are the first that can handle 'diagonal' queries (like the 'overweight' query above). Experiments on synthetic and real datasets show that our method quickly and accurately estimates the 'hidden' distance function in the user's mind.

Part of this work was done while this author was visiting University of Maryland and Carnegie Mellon University.

This work was supported by NSF IRI-9625428. Also, by the National Science Foundation, ARPA and NASA under NSF Cooperative Agreement No. IRI-9411299.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Proceedings of the 24th VLDB Conference, New York, USA, 1998
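
The abstract states that the problem is formalized as a minimization but does not give the formulation. As a rough illustration only, the sketch below assumes a quadratic ("ellipsoid") distance D(x, q) = (x - q)^T M (x - q) with det(M) = 1, and minimizes the score-weighted sum of distances to the examples. Under that assumption the query point q is the weighted mean of the examples and M is proportional to the inverse of their weighted scatter matrix. The function names, the two-attribute restriction, and the sample (height, weight) data are all hypothetical and chosen only to mirror the 'overweight' example.

```python
from math import sqrt

def fit_ellipsoid_distance(points, scores):
    """Estimate a query point q and a 2x2 matrix M from scored 2-D examples.

    Assumed model (not taken from the abstract): minimize
    sum_i scores[i] * (x_i - q)^T M (x_i - q) subject to det(M) = 1.
    """
    total = sum(scores)
    # The weighted mean of the examples serves as the estimated query point.
    qx = sum(v * x for (x, _), v in zip(points, scores)) / total
    qy = sum(v * y for (_, y), v in zip(points, scores)) / total
    # Weighted scatter matrix C of the examples around q.
    cxx = sum(v * (x - qx) ** 2 for (x, _), v in zip(points, scores))
    cyy = sum(v * (y - qy) ** 2 for (_, y), v in zip(points, scores))
    cxy = sum(v * (x - qx) * (y - qy) for (x, y), v in zip(points, scores))
    det = cxx * cyy - cxy * cxy
    if det <= 0:
        raise ValueError("degenerate examples: scatter matrix is singular")
    # M = det(C)^(1/2) * C^(-1) in two dimensions, which gives det(M) = 1.
    scale = sqrt(det)
    m = [[cyy / scale, -cxy / scale],
         [-cxy / scale, cxx / scale]]
    return (qx, qy), m

def distance(p, q, m):
    """Quadratic distance (p - q)^T M (p - q) for a symmetric 2x2 M."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    return m[0][0] * dx * dx + 2 * m[0][1] * dx * dy + m[1][1] * dy * dy

# Hypothetical examples: (height in inches, weight in lb), roughly 4 lb/inch.
examples = [(60, 238), (64, 258), (68, 270), (72, 290)]
goodness = [1, 1, 1, 1]
q, m = fit_ellipsoid_distance(examples, goodness)

# A point displaced along the weight/height trend versus one displaced across it.
along = distance((q[0] + 1, q[1] + 4), q, m)
across = distance((q[0] + 4, q[1] - 1), q, m)
```

With these sample points, `along` comes out far smaller than `across` even though both offsets have the same Euclidean length. The off-diagonal entries of M are what let the learned distance follow a correlation between attributes, which per-attribute weights alone cannot express.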

Related papers

Keyword Search in Databases

Querying using keywords is easily the most widely used form of querying today. While keyword searching is widely used to search documents on the Web, querying of databases currently relies on complex query languages that are inappropriate for casual end-users, since they are complex and hard to learn. Given the popularity of keyword search, and the increasing use of databases as the back end fo...

Querying and computing with BioCyc databases

We describe multiple methods for accessing and querying the complex and integrated cellular data in the BioCyc family of databases: access through multiple file formats, access through Application Program Interfaces (APIs) for LISP, Perl and Java, and SQL access through the BioWarehouse relational database.

Parallel Visual Information Retrieval in VizIR

This paper describes how parallel retrieval is implemented in the content-based visual information retrieval framework VizIR. Generally, two major use cases for parallelisation exist in visual retrieval systems: distributed querying and simultaneous multi-user querying. Distributed querying includes parallel query execution and querying multiple databases. Content-based querying is a two-step p...

Multi-Scale Partitions: Application to Spatial and Statistical Databases

We study the impact of scale on data representation from both the modelling and querying points of view. While our starting point was geographical applications, statistical databases also address this problem of data representation at various levels of abstraction. From these requirements, we propose a model which allows: (i) database querying without exact knowledge of the data abstraction lev...

Issues in Querying Multi-media Databases

Introduction: The amount of information available to users, especially on the web, is growing exponentially month by month. To find useful information on the web, users normally have to rely on some sort of keyword-matching search, and/or navigate their way through the list of pointers to objects in different data repositories. This is not enough to satisfy the needs of the ever more sophisticate...

Publication date: 1998